    06431 Abstracts Collection -- Scalable Data Management in Evolving Networks

    From 22.10.06 to 27.10.06, the Dagstuhl Seminar 06431 "Scalable Data Management in Evolving Networks" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

    Dynamic Clustering in Object-Oriented Databases: An Advocacy for Simplicity

    We present in this paper three dynamic clustering techniques for Object-Oriented Databases (OODBs). The first two, Dynamic, Statistical & Tunable Clustering (DSTC) and StatClust, exploit both comprehensive usage statistics and the inter-object reference graph. They are quite elaborate. However, they are also complex to implement and induce a high overhead. The third clustering technique, called Detection & Reclustering of Objects (DRO), is based on the same principles but is much simpler to implement. These three clustering algorithms have been implemented in the Texas persistent object store and compared in terms of clustering efficiency (i.e., overall performance increase) and overhead using the Object Clustering Benchmark (OCB). The results obtained showed that DRO induced a lighter overhead while still achieving better overall performance.
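
    The abstract does not spell the algorithms out, but the core idea behind statistics-driven clustering such as DRO can be made concrete with a short sketch: group frequently accessed objects together with the objects they reference, so that related objects land on the same page. The function, thresholds and data structures below are illustrative assumptions, not the implementation evaluated in the paper.

        # Hypothetical sketch of statistics-driven reclustering in the spirit of DRO.
        # Page capacity, hot threshold and data structures are assumed for illustration.
        PAGE_CAPACITY = 8          # objects per page (assumed)
        HOT_THRESHOLD = 10         # minimum access count that triggers clustering (assumed)

        def recluster(access_counts, references):
            """access_counts: {oid: int}; references: {oid: [oids it references]}."""
            hot = sorted((o for o, c in access_counts.items() if c >= HOT_THRESHOLD),
                         key=lambda o: -access_counts[o])
            placed, pages = set(), []
            for root in hot:
                if root in placed:
                    continue
                page, stack = [], [root]
                # depth-first walk of the reference graph, bounded by the page capacity
                while stack and len(page) < PAGE_CAPACITY:
                    oid = stack.pop()
                    if oid in placed:
                        continue
                    page.append(oid)
                    placed.add(oid)
                    stack.extend(references.get(oid, []))
                pages.append(page)
            return pages

        if __name__ == "__main__":
            counts = {"a": 50, "b": 3, "c": 20, "d": 1}
            refs = {"a": ["b", "d"], "c": ["a"]}
            print(recluster(counts, refs))  # [['a', 'd', 'b'], ['c']]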

    Managing Data Replication in Mobile Ad-Hoc Network Databases

    A Mobile Ad-hoc Network (MANET) is a collection of wireless autonomous nodes without any fixed backbone infrastructure. All the nodes in a MANET are mobile and power-restricted and thus, disconnection and network partitioning occur frequently. In addition, many MANET database transactions have time constraints. In this paper, a Data REplication technique for real-time Ad-hoc Mobile databases (DREAM) is proposed that addresses all those issues. It improves data accessibility while considering the issue of energy limitation by replicating hot data items at servers that have higher remaining power. It addresses disconnection and network partitioning by introducing new data and transaction types and by considering the stability of wireless links. It handles the real-time transaction issue by replicating data items that are accessed frequently by firm transactions before those accessed frequently by soft transactions. DREAM is prototyped on laptops and PDAs and compared with two existing replication techniques using a military database application. The results show that DREAM performs the best in terms of percentage of successfully executed transactions, servers’ and clients’ energy consumption, and balance of energy consumption distribution among servers.
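
    A rough sketch of the replica-placement policy the abstract describes: items accessed by firm transactions are replicated before items accessed only by soft transactions, and replicas go to the servers with the most remaining battery power. The function, field names and replica count below are assumptions made for illustration, not DREAM's actual code.

        # Illustrative replica-placement sketch in the spirit of DREAM (assumed names).
        def plan_replication(items, servers, replicas_per_item=2):
            """items: dicts with 'id', 'access_freq', 'firm' (bool);
            servers: dicts with 'id', 'remaining_power', 'capacity'."""
            # firm-transaction items first, then by descending access frequency
            ranked = sorted(items, key=lambda it: (not it["firm"], -it["access_freq"]))
            plan = {}
            for item in ranked:
                # prefer servers with the highest remaining power that still have room
                candidates = sorted((s for s in servers if s["capacity"] > 0),
                                    key=lambda s: -s["remaining_power"])
                chosen = candidates[:replicas_per_item]
                for s in chosen:
                    s["capacity"] -= 1
                plan[item["id"]] = [s["id"] for s in chosen]
            return plan

        if __name__ == "__main__":
            items = [{"id": "x", "access_freq": 90, "firm": False},
                     {"id": "y", "access_freq": 40, "firm": True}]
            servers = [{"id": "s1", "remaining_power": 80, "capacity": 2},
                       {"id": "s2", "remaining_power": 30, "capacity": 2}]
            print(plan_replication(items, servers))  # 'y' (firm) is placed before 'x'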

    A Vision of a Decisional Model for Re-optimizing Query Execution Plans Based on Machine Learning Techniques

    Many of the existing cloud database query optimization algorithms target reducing the monetary cost paid to cloud service providers in addition to query response time. These query optimization algorithms rely on an accurate cost estimation so that the optimal query execution plan (QEP) is selected. The cloud environment is dynamic, meaning the hardware configuration, data usage, and workload allocations are continuously changing. These dynamic changes make an accurate query cost estimation difficult to obtain. Concurrently, the query execution plan must be adjusted automatically to address these changes. In order to optimize the QEP with a more accurate cost estimation, the query needs to be optimized multiple times during execution, and each re-optimization should use the most up-to-date estimates. However, deciding when to pause execution so that the re-optimization overhead stays minimal is non-trivial. In this paper, we present our vision of a method that uses machine learning techniques to predict the best timings for optimization during execution.
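
    Since this is a vision paper, the concrete predictor is not given; the sketch below only illustrates the kind of decision involved: at an operator boundary, score a few runtime features and decide whether pausing to re-optimize is likely to pay off. The features, weights and threshold are made-up placeholders standing in for a trained model.

        # Minimal stand-in for a learned "re-optimize now?" predictor (assumed features).
        import math

        WEIGHTS = {"cost_estimation_error": 2.0,   # |estimated - observed| / estimated
                   "fraction_remaining": 1.5,      # share of the plan not yet executed
                   "resource_drift": 1.0}          # observed change in node capacity
        BIAS = -2.5
        THRESHOLD = 0.5

        def should_reoptimize(features):
            """Logistic score over the assumed features; True = pause and re-optimize."""
            z = BIAS + sum(WEIGHTS[k] * features[k] for k in WEIGHTS)
            return 1.0 / (1.0 + math.exp(-z)) > THRESHOLD

        if __name__ == "__main__":
            # Large estimation error early in the plan -> worth re-optimizing.
            print(should_reoptimize({"cost_estimation_error": 0.8,
                                     "fraction_remaining": 0.9,
                                     "resource_drift": 0.4}))  # True
            # Small error near the end of the plan -> keep the current QEP.
            print(should_reoptimize({"cost_estimation_error": 0.1,
                                     "fraction_remaining": 0.1,
                                     "resource_drift": 0.0}))  # False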

    P-LUPOSDATE: Using Precomputed Bloom Filters to Speed Up SPARQL Processing in the Cloud

    Increasingly, data on the Web is stored in the form of Semantic Web data. Because of today's information overload, it becomes very important to store and query these big datasets in a scalable way and hence in a distributed fashion. Cloud Computing offers such a distributed environment with dynamic reallocation of computing and storage resources based on needs. In this work we introduce a scalable distributed Semantic Web database in the Cloud. In order to reduce the number of (unnecessary) intermediate results early, we apply Bloom filters. Instead of computing Bloom filters during query processing, a time-consuming task as it has traditionally been done, we precompute the Bloom filters as far as possible and store them in the indices alongside the data. The experimental results with data sets of up to 1 billion triples show that our approach speeds up query processing significantly and sometimes even reduces the processing time to less than half.
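
    A small hand-rolled Bloom filter illustrating the idea described above: a filter precomputed over the bindings stored on one node lets another node discard intermediate results that cannot join before shipping them. The filter size, hash construction and the example bindings are illustrative assumptions, not P-LUPOSDATE's actual index layout.

        # Toy Bloom filter used to prune non-joinable intermediate results.
        import hashlib

        class BloomFilter:
            def __init__(self, num_bits=1024, num_hashes=3):
                self.num_bits, self.num_hashes = num_bits, num_hashes
                self.bits = bytearray(num_bits // 8)

            def _positions(self, value):
                for i in range(self.num_hashes):
                    digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
                    yield int.from_bytes(digest[:8], "big") % self.num_bits

            def add(self, value):
                for pos in self._positions(value):
                    self.bits[pos // 8] |= 1 << (pos % 8)

            def might_contain(self, value):
                return all(self.bits[pos // 8] & (1 << (pos % 8))
                           for pos in self._positions(value))

        if __name__ == "__main__":
            # Precomputed at load time, e.g. over the objects of ':worksFor' triples.
            works_for = BloomFilter()
            for company in ["ex:acme", "ex:globex"]:
                works_for.add(company)

            # During query processing, prune bindings that cannot join.
            candidate_bindings = ["ex:acme", "ex:initech", "ex:globex"]
            survivors = [b for b in candidate_bindings if works_for.might_contain(b)]
            print(survivors)  # ['ex:acme', 'ex:globex'] (false positives are possible)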

    High-Performance Spatial Query Processing on Big Taxi Trip Data Using GPGPUs

    City-wide GPS-recorded taxi trip data contains rich information for traffic and travel analysis to facilitate transportation planning and urban studies. However, traditional data management techniques are largely incapable of processing big taxi trip data at the scale of hundreds of millions of records. In this study, we aim at utilizing General Purpose computing on Graphics Processing Units (GPGPU) technologies to speed up the processing of complex spatial queries on big taxi data on inexpensive commodity GPUs. By using the land use types of tax lot polygons as a proxy for trip purposes at the pickup and drop-off locations, we formulate a taxi trip data analysis problem as a large-scale nearest-neighbor spatial query problem based on point-to-polygon distance. Experiments on nearly 170 million taxi trips in New York City (NYC) in 2009 and 735,488 tax lot polygons with 4,698,986 vertices have demonstrated the efficiency of the proposed techniques: the GPU implementation is about 10-20X faster than the host system and completes the spatial query in about a minute. We further discuss several interesting patterns discovered from the query results which warrant further study. The proposed approach can be an interesting alternative to traditional MapReduce/Hadoop-based approaches to processing big data with respect to performance and cost.
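
    The core spatial operation described above is a nearest-neighbor search from pickup points to tax-lot polygons by point-to-polygon distance. The sketch below is a CPU stand-in (NumPy) for that kind of data-parallel kernel, brute-forcing the minimum point-to-edge distance; on a GPU the same arithmetic would be evaluated for many points in parallel. Data layout, names and the boundary-only distance are assumptions for illustration, not the paper's GPU implementation.

        # CPU stand-in for a point-to-polygon nearest-neighbor query (assumed layout).
        import numpy as np

        def point_to_segments_dist(p, a, b):
            """Distance from point p (2,) to segments with endpoints a, b (n, 2)."""
            ab = b - a
            t = np.clip(np.einsum("ij,ij->i", p - a, ab) /
                        np.maximum(np.einsum("ij,ij->i", ab, ab), 1e-12), 0.0, 1.0)
            closest = a + t[:, None] * ab
            return np.linalg.norm(p - closest, axis=1)

        def nearest_polygon(point, polygons):
            """polygons: list of (n_i, 2) vertex arrays (closed rings); boundary distance only."""
            best_id, best_dist = -1, np.inf
            for pid, ring in enumerate(polygons):
                a, b = ring, np.roll(ring, -1, axis=0)      # consecutive edges
                d = point_to_segments_dist(point, a, b).min()
                if d < best_dist:
                    best_id, best_dist = pid, d
            return best_id, best_dist

        if __name__ == "__main__":
            lots = [np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float),
                    np.array([[5, 5], [6, 5], [6, 6], [5, 6]], float)]
            print(nearest_polygon(np.array([0.9, 2.0]), lots))  # nearest lot id 0, distance 1.0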

    SLA-Aware Cloud Query Processing with Reinforcement Learning-based Multi-Objective Re-Optimization

    Query processing on cloud database systems is a challenging problem due to the dynamic cloud environment. In cloud database systems, besides query execution time, users also consider the monetary cost to be paid to the cloud provider for executing queries. Moreover, a Service Level Agreement (SLA) is signed between users and cloud providers before any service is provided. Thus, from the profit-oriented perspective of the cloud providers, query re-optimization is a multi-objective optimization problem that minimizes not only query execution time and monetary cost but also SLA violations. In this paper, we introduce ReOptRL and SLAReOptRL, two novel query re-optimization algorithms based on deep reinforcement learning. Experiments show that both algorithms improve query execution time and query execution monetary cost by 50% over existing algorithms, and that SLAReOptRL has the lowest SLA violation rate among all the algorithms.
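
    To make the multi-objective framing concrete, here is one possible shape of a reward signal that penalizes execution time, monetary cost and SLA violations together. The weighting scheme and SLA fields are assumptions for this listing, not the reward actually defined for ReOptRL or SLAReOptRL.

        # Illustrative multi-objective reward for an RL-based re-optimizer (assumed form).
        def reward(elapsed_s, monetary_cost, sla):
            """sla: dict with 'max_time_s', 'max_cost', and per-unit violation penalties."""
            time_term = elapsed_s / sla["max_time_s"]
            cost_term = monetary_cost / sla["max_cost"]
            violation = 0.0
            if elapsed_s > sla["max_time_s"]:
                violation += sla["time_penalty"] * (elapsed_s - sla["max_time_s"])
            if monetary_cost > sla["max_cost"]:
                violation += sla["cost_penalty"] * (monetary_cost - sla["max_cost"])
            # Higher is better: cheap, fast, violation-free executions score near 0.
            return -(time_term + cost_term + violation)

        if __name__ == "__main__":
            sla = {"max_time_s": 60, "max_cost": 0.10, "time_penalty": 0.5, "cost_penalty": 20}
            print(reward(30, 0.04, sla))   # within the SLA: mild negative reward
            print(reward(90, 0.12, sla))   # violates both bounds: strongly negative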

    High-performance online spatial and temporal aggregations on multi-core CPUs and many-core GPUs

    With the increasing availability of locating and navigation technologies on portable wireless devices, huge amounts of location data are being captured at ever-growing rates. Spatial and temporal aggregations in an Online Analytical Processing (OLAP) setting for the large-scale ubiquitous urban sensing data play an important role in understanding urban dynamics and facilitating decision making. Unfortunately, existing spatial, temporal and spatiotemporal OLAP techniques are mostly based on traditional computing frameworks, i.e., disk-resident systems on uniprocessors based on serial algorithms, which makes them incapable of handling large-scale data on the parallel hardware architectures that commodity computers are now equipped with. In this study, we report our designs, implementations and experiments on developing a data management platform and a set of parallel techniques to support high-performance online spatial and temporal aggregations on multi-core CPUs and many-core Graphics Processing Units (GPUs). Our experiment results show that we are able to spatially associate nearly 170 million taxi pickup location points with their nearest street segments among 147,011 candidates in about 5-25 s on both an Nvidia Quadro 6000 GPU device and dual Intel Xeon E5405 quad-core CPUs when their Vector Processing Units (VPUs) are utilized for computing-intensive tasks. After spatially associating points with road segments, spatial, temporal and spatiotemporal aggregations are reduced to relational aggregations and can be processed in a fraction of a second on both GPUs and multi-core CPUs. In addition to demonstrating the feasibility of building a high-performance OLAP system for processing large-scale taxi trip data for real-time, interactive data exploration, our work also opens paths to achieving even higher OLAP query efficiency for large-scale applications through integrating domain-specific data management platforms, novel parallel data structures and algorithm designs, and hardware-architecture-friendly implementations.
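
    The key observation above is that once each pickup point carries the id of its nearest street segment, the spatiotemporal roll-up becomes an ordinary relational aggregation. A minimal in-memory sketch of that final step, with assumed field names:

        # Relational-style roll-up after the spatial join: pickups per (segment, hour).
        from collections import Counter
        from datetime import datetime

        def pickups_per_segment_hour(trips):
            """trips: iterable of dicts with 'segment_id' (from the spatial join)
            and 'pickup_time' (datetime). Returns counts keyed by (segment, hour)."""
            counts = Counter()
            for t in trips:
                counts[(t["segment_id"], t["pickup_time"].hour)] += 1
            return counts

        if __name__ == "__main__":
            trips = [{"segment_id": 101, "pickup_time": datetime(2009, 1, 5, 8, 15)},
                     {"segment_id": 101, "pickup_time": datetime(2009, 1, 5, 8, 40)},
                     {"segment_id": 207, "pickup_time": datetime(2009, 1, 5, 17, 5)}]
            print(pickups_per_segment_hour(trips))  # Counter({(101, 8): 2, (207, 17): 1})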

    Performance Comparison of Scheduling Techniques to Manage Transactions for Real-Time Mobile Databases in Ad Hoc Networks

    A Mobile Ad-hoc Network (MANET) is an autonomous system of mobile hosts (MHs) with similar transmission power and computation capabilities that communicate over relatively bandwidth-constrained wireless links. Applications such as emergency/rescue operations, conferences/meetings/lectures, disaster relief efforts, Bluetooth (Personal Area Network) and military networks can be conceived as applications of MANETs due to the fact that they cannot rely on centralized and organized connectivity. In these environments, transactions are time-critical and must be executed not only correctly but also within their deadlines; that is, the user who submits a transaction would like it to be completed before a certain time in the future. This study focuses on the comparison of four scheduling techniques based on the policy of assigning priorities to transactions in the system. The techniques are: First Come First Serve (FCFS) [1,2], Earliest Deadline (ED) [1,2,5], Least Slack (LS) [1,2,8] and Least Slack Mobile (LSM) proposed in [3], where some modifications to the Least Slack technique with respect to energy constraints, disconnection and transaction type (firm/soft) are considered. Applying these modifications to Earliest Deadline, the performance of the system will be evaluated to measure the percentage of transactions missing deadlines and the total energy consumption in the mobile hosts. The performance evaluation of the techniques will be carried out by means of simulation. The simulation model is implemented using Visual SLAM/AweSim [7].
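
    A small sketch of how the four policies assign priorities. The FCFS, ED and LS definitions follow their standard formulations; LSM's adjustments for energy, disconnection and firm/soft transaction type are only paraphrased from the abstract, so the specific weights and field names below are assumptions.

        # Priority functions for FCFS, ED, LS and an LSM-style variant (assumed fields).
        def fcfs_priority(tx, now):
            return tx["arrival_time"]            # earlier arrival = higher priority

        def ed_priority(tx, now):
            return tx["deadline"]                # earlier deadline = higher priority

        def ls_priority(tx, now):
            # slack = time to deadline minus remaining execution time
            return tx["deadline"] - now - tx["remaining_exec_time"]

        def lsm_priority(tx, now):
            # Illustrative LSM-style adjustment: favor firm transactions and
            # transactions whose host is low on energy or may disconnect soon.
            slack = ls_priority(tx, now)
            if tx["type"] == "firm":
                slack -= 5.0
            if tx["host_remaining_energy"] < 0.2 or tx["host_may_disconnect"]:
                slack -= 3.0
            return slack

        if __name__ == "__main__":
            now = 100.0
            tx = {"arrival_time": 90.0, "deadline": 130.0, "remaining_exec_time": 12.0,
                  "type": "firm", "host_remaining_energy": 0.15, "host_may_disconnect": False}
            # Lower value = scheduled first under each policy.
            for f in (fcfs_priority, ed_priority, ls_priority, lsm_priority):
                print(f.__name__, f(tx, now))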

    A Prototype for Translating XQuery Expressions into XSLT Stylesheets

    The need for a user-friendly query language has become increasingly important since the introduction of XML. The W3C developed XQuery for the purpose of querying XML data, but XQuery is not available in every tool. For historical reasons, many tools only support processing XSLT stylesheets. It is desirable to use XQuery with these tools, since its design goals include, among others, being more human-readable and less error-prone than XSLT. Instead of implementing XQuery support for every tool, we propose to use an XQuery-to-XSLT translator. Following this idea, XQuery becomes available for all tools which currently support XSLT stylesheets. In this paper, we propose a translator which transforms XQuery expressions into XSLT stylesheets, and we analyze the performance of the translation and XSLT processing in comparison to native XQuery processing.
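
    To make the mapping concrete, here is a toy illustration of the translation idea: a single restricted FLWOR pattern ("for $v in <path> return $v/<child>") rewritten into an equivalent xsl:for-each. A real translator covers far more of XQuery; this regex-based fragment and its output stylesheet are only an illustrative assumption, not the prototype described in the paper.

        # Toy XQuery-to-XSLT rewrite for one restricted FLWOR pattern.
        import re

        STYLESHEET = """<xsl:stylesheet version="1.0"
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/">
            <result>
              <xsl:for-each select="{outer}">
                <xsl:copy-of select="{inner}"/>
              </xsl:for-each>
            </result>
          </xsl:template>
        </xsl:stylesheet>"""

        FLWOR = re.compile(r"for\s+\$(\w+)\s+in\s+(\S+)\s+return\s+\$(\w+)/(\S+)")

        def xquery_to_xslt(query):
            m = FLWOR.fullmatch(query.strip())
            if not m or m.group(1) != m.group(3):
                raise ValueError("only the simple single-variable FLWOR pattern is supported")
            return STYLESHEET.format(outer=m.group(2), inner=m.group(4))

        if __name__ == "__main__":
            print(xquery_to_xslt("for $b in //bib/book return $b/title"))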